For an overview, see the introductory blog post "Introducing Plotly Express".
# Plotly Express is now a sublibrary of Plotly:
import plotly.express as px
Tidy Data according to Hadley Wickham: (original paper)
import pandas as pd
# Here is a "messy" dataframe (wide-form):
messy = pd.DataFrame({
'patient': ['John Smith', 'Jane Doe', 'Mary Johnson'],
'Treatment A': [None, 16, 3],
'Treatment B': [2, 11, 1],
})
messy
| | patient | Treatment A | Treatment B |
|---|---|---|---|
| 0 | John Smith | NaN | 2 |
| 1 | Jane Doe | 16.0 | 11 |
| 2 | Mary Johnson | 3.0 | 1 |
# The pandas method `.melt` can often be used to tidy the data (long-form):
tidy = messy.melt(
id_vars=['patient'],
value_vars=['Treatment A', 'Treatment B'],
var_name='Treatment',
value_name='Result',
)
tidy
| | patient | Treatment | Result |
|---|---|---|---|
| 0 | John Smith | Treatment A | NaN |
| 1 | Jane Doe | Treatment A | 16.0 |
| 2 | Mary Johnson | Treatment A | 3.0 |
| 3 | John Smith | Treatment B | 2.0 |
| 4 | Jane Doe | Treatment B | 11.0 |
| 5 | Mary Johnson | Treatment B | 1.0 |
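As a quick sanity check on the melt step, the following sketch rebuilds the wide table from the long one with `DataFrame.pivot`. It reuses the example data from above; the name `wide_again` is introduced here for illustration only.

```python
import pandas as pd

messy = pd.DataFrame({
    'patient': ['John Smith', 'Jane Doe', 'Mary Johnson'],
    'Treatment A': [None, 16, 3],
    'Treatment B': [2, 11, 1],
})

# Wide -> long: every non-id column becomes a (Treatment, Result) pair.
tidy = messy.melt(
    id_vars=['patient'],
    var_name='Treatment',
    value_name='Result',
)

# One row per (patient, treatment) pair: 3 patients x 2 treatments.
print(tidy.shape)

# Long -> wide: pivot undoes the melt, with treatments as columns again.
wide_again = tidy.pivot(index='patient', columns='Treatment', values='Result')
print(wide_again)
```

Note that `pivot` sorts the index, so the patients come back in alphabetical order rather than in their original order.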
# Once the data is in tidy format, Plotly Express lets you build complex interactive plots with a single call:
px.bar(
data_frame=tidy,
x='patient',
y='Result',
color='Treatment',
barmode='group',
title='Medical Treatment Results',
)
# Plotly Express also accepts messy (wide-form) dataframes directly, which is handy for quick visualizations during data exploration:
px.bar(
messy,
x='patient',
y=['Treatment A', 'Treatment B'],
barmode='group',
title='Medical Treatment Results',
)
Use the example cluster data, loaded with:
import pandas as pd
table = pd.read_csv(
'https://raw.githubusercontent.com/chumo/Data2Serve/master/transition_clusters.csv')
... and convert it into a tidy dataframe (HINT: use the pd.concat function). It should look like this:

table.head()
tidy = pd.concat([
table
.loc[:, ['Xi', 'Yi', 'color']]
.assign(initial=True)
.rename(columns={'Xi': 'x', 'Yi': 'y'}),
table
.loc[:, ['Xf', 'Yf', 'color']]
.assign(initial=False)
.rename(columns={'Xf': 'x', 'Yf': 'y'})],
)
tidy
| | x | y | color | initial |
|---|---|---|---|---|
| 0 | 109.360643 | 434.557514 | red | True |
| 1 | 55.957358 | 438.934136 | red | True |
| 2 | 369.115969 | 419.904538 | red | True |
| 3 | 491.392739 | 492.316412 | red | True |
| 4 | 34.286602 | 404.017801 | red | True |
| ... | ... | ... | ... | ... |
| 85 | 298.802341 | 149.868698 | blue | False |
| 86 | 322.576822 | 108.178118 | blue | False |
| 87 | 293.644438 | 130.646623 | blue | False |
| 88 | 304.092241 | 91.984033 | blue | False |
| 89 | 320.828650 | 119.716195 | blue | False |
180 rows × 4 columns
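The same concat pattern can be factored into a small helper. The sketch below is one possible variant, not the reference solution: it runs on a tiny made-up table with the same column names as the cluster data, and the helper `stage` is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the cluster table (same column names, toy values).
table = pd.DataFrame({
    'Xi': [1.0, 2.0, 3.0],
    'Yi': [4.0, 5.0, 6.0],
    'Xf': [1.5, 2.5, 3.5],
    'Yf': [4.5, 5.5, 6.5],
    'color': ['red', 'green', 'blue'],
})


def stage(df, suffix, initial):
    """Select the coordinates of one stage ('i' or 'f') under common names."""
    return (
        df[[f'X{suffix}', f'Y{suffix}', 'color']]
        .rename(columns={f'X{suffix}': 'x', f'Y{suffix}': 'y'})
        .assign(initial=initial)
    )


tidy = pd.concat(
    [stage(table, 'i', True), stage(table, 'f', False)],
    ignore_index=True,  # renumber rows so index labels are not repeated
)
print(tidy)
```

Without `ignore_index=True`, each half keeps its own 0-based index, which is why the table above ends at row label 89 even though it has 180 rows.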
Take the tidy dataframe created in Exercise 1 and build this plot with two subplots:

import plotly.express as px
px.scatter(
data_frame=tidy,
x='x',
y='y',
color_discrete_map={'red':'red', 'green':'green', 'blue':'blue'},
color='color',
facet_row='initial',
labels={'initial': 'better'},
width=600,
height=1000,
title='Some random data',
).update_layout(
showlegend=False
)
Now make each subplot a frame of an animated plot:

import plotly.express as px
px.scatter(
data_frame=tidy.assign(initial=tidy.initial.astype(str)),
x='x',
y='y',
color_discrete_map={'red':'red', 'green':'green', 'blue':'blue'},
color='color',
# facet_row='initial',
labels={'initial': 'better'},
width=600,
height=600,
title='Some random data',
animation_frame='initial',
).update_layout(
showlegend=False
).show()
Using the gapminder data:
gapminder = px.data.gapminder()
gapminder.head()
| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 1952 | 28.801 | 8425333 | 779.445314 | AFG | 4 |
| 1 | Afghanistan | Asia | 1957 | 30.332 | 9240934 | 820.853030 | AFG | 4 |
| 2 | Afghanistan | Asia | 1962 | 31.997 | 10267083 | 853.100710 | AFG | 4 |
| 3 | Afghanistan | Asia | 1967 | 34.020 | 11537966 | 836.197138 | AFG | 4 |
| 4 | Afghanistan | Asia | 1972 | 36.088 | 13079460 | 739.981106 | AFG | 4 |
Build the following plots:


and this animated plot:

gapminder.columns
gapminder['pop'].describe()
count    1.704000e+03
mean     2.960121e+07
std      1.061579e+08
min      6.001100e+04
25%      2.793664e+06
50%      7.023596e+06
75%      1.958522e+07
max      1.318683e+09
Name: pop, dtype: float64
import plotly.express as px
px.scatter(
# keep only the rows for 1977
data_frame=gapminder.query('year == 1977'),
x='gdpPercap',
log_x=True,
y='lifeExp',
size='pop',
size_max=70,
color='continent',
)
import plotly.express as px
px.box(
# keep only the rows for 1977 and 1997
data_frame=gapminder.loc[gapminder['year'].isin([1977, 1997])],
x='continent',
y='gdpPercap',
facet_col = 'year',
points=False,
width=1000,
height=500,
)
import plotly.express as px
px.bar(
data_frame=gapminder.query('continent == "Europe"').sort_values('gdpPercap'),
x='gdpPercap',
y='country',
range_x=[0,gapminder.query('continent == "Europe"').gdpPercap.max()],
category_orders={'year':list(range(1950,2010))},
width=600,
height=600,
animation_frame='year',
animation_group='country'
)
Since pandas 0.25, the .plot API can delegate to plotting backends other than the default matplotlib one.
The Plotly backend (see here) can be set with:
import pandas as pd
pd.options.plotting.backend = 'plotly'
pd.__version__
'1.3.4'
You can then use the most common Plotly Express features by passing Plotly parameters to the .plot method available on any pandas DataFrame:
tidy.plot.bar(
x='x',
y='y',
color='color',
barmode='group',
title='Random Stuff',
)